Pointer aliasing

In computer programming, aliasing refers to the situation where the same memory location can be accessed using different names.

For instance, if a function takes two pointers A and B which have the same value, then the name A[0] aliases the name B[0]. In this case we say the pointers A and B alias each other. Note that the concept of pointer aliasing is not very well-defined – two pointers A and B may or may not alias each other, depending on what operations are performed in the function using A and B. For instance, in memcpy(A+8, A, 8), the destination and source pointers do not alias, but in memmove(A+8, A, 9) they do.

Aliasing and re-ordering

Aliasing introduces strong constraints on program execution order. If two write accesses which alias occur in sequence in a program text, they must occur in sequence in the final machine code. Re-ordering the accesses will produce an incorrect program result (in the general case). The same is true for a write access and a read access.

However, if two read accesses which alias occur in sequence in a program text, they need not occur in the same sequence in the machine code - aliasing read accesses are safe to re-order. This is because the underlying data is not changed by the read operation, so the same value is always read.

It is vitally important that a compiler can detect which accesses may alias each other, so that re-ordering optimizations can be performed correctly.

Several strategies are possible:

In C, any function pointer argument may alias any other function pointer argument. The compiler must assume that any accesses through these pointers can alias. This dramatically restricts the potential for optimization.

In C++, pointer arguments are assumed not to alias if they point to fundamentally different types ("strict aliasing" rules). This allows more optimizations to be done than in C.

In C99, the restrict keyword specifies that a pointer argument does not alias any other pointer argument.

In FORTRAN, function arguments may not alias each other, and the compiler assumes they do not. This enables excellent optimization, and is one major reason for FORTRAN's reputation as a fast language. (Note that aliasing may still occur within a FORTRAN function. For instance, if A is an array and i and j are indices which happen to have the same value, then A[i] and A[j] are two different names for the same memory location. Fortunately, since the base array must have the same name, index analysis can be done to determine cases where A[i] and A[j] cannot alias.)

In pure functional languages, function arguments may usually alias each other, but all pointers are read-only. Thus, no alias analysis needs to be done.

Aliasing and multi-threading

If multiple threads have names which alias the same memory location, two problems arise.

Firstly, unless the memory location is protected in some way, then read-modify-write operations which are non-atomic might be interleaved with similar operations on another thread, producing incorrect results. In this case, some form of synchronization primitive must be used (e.g. a critical section).

Secondly, even with synchronization, if two aliasing accesses occur in different threads in a non-ordered way, the program behaviour may become non-deterministic - different runs of the same program on the same data may produce different results. Thus, it is often important to impose an explicit ordering between threads which have aliases.

External links